Continuous matrix approximation on distributed data
Similar resources
Continuous Matrix Approximation on Distributed Data
Tracking and approximating data matrices in streaming fashion is a fundamental challenge. The problem requires more care and attention when data comes from multiple distributed sites, each receiving a stream of data. This paper considers the problem of “tracking approximations to a matrix” in the distributed streaming model. In this model, there are m distributed sites each observing a distinct...
Distributed Adaptive Sampling for Kernel Matrix Approximation
Most kernel-based methods, such as kernel or Gaussian process regression, kernel PCA, ICA, or k-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix K_n requires at least O(n^2) time and space for n samples. Recent works [1, 9] show that sampling points with replacement according to their ridge leverage scores (RLS) generates small dictionaries of r...
Integer Matrix Approximation and Data Mining
Integer datasets frequently appear in many applications in science and engineering. To analyze these datasets, we consider an integer matrix approximation technique that can preserve the original dataset characteristics. Because integers are discrete in nature, to the best of our knowledge, no previously proposed technique developed for real numbers can be successfully applied. In this study, w...
arXiv:1404.7571v1 [cs.DB] 30 Apr 2014 Continuous Matrix Approximation on Distributed Data
Tracking and approximating data matrices in streaming fashion is a fundamental challenge. The problem requires more care and attention when data comes from multiple distributed sites, each receiving a stream of data. This paper considers the problem of “tracking approximations to a matrix” in the distributed streaming model. In this model, there are m distributed sites each observing a distinct...
V-Optimal Filters for Data Approximation in Continuous Data Streams
Monitoring data streams in real time over distributed streaming environments plays a large role in maintaining situation awareness, which is a great challenge due to huge data volumes and bandwidth limitations. In this monitoring process, transmission cost and value accuracy are two very important but conflicting factors in measuring the efficacy of the system. On one hand, increasing value acc...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2014
ISSN: 2150-8097
DOI: 10.14778/2732951.2732954